source('dependencies.R')
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
#install.packages("stargazer")
library(stargazer)

Theresa Szczepanski
October 22, 2023
The Massachusetts Education Reform Act of 1993 was passed amid a national movement toward education reform in the United States. As early as 1989 there were calls to establish national curriculum standards as a way to improve students' college and career readiness and close poverty gaps (Greer 2018). The Massachusetts Comprehensive Assessment System (MCAS) tests were introduced as part of the Massachusetts Education Reform Act.
The MCAS tests are a significant tool for educational equity. Scores on the Grade 10 Math MCAS test "predict longer-term educational attainments and labor market success, above and beyond typical markers of student advantage," and differences among students are largely, and "sometimes completely accounted for," by differences in 10th grade MCAS scores and educational attainments (Papy 2020).
With the introduction of the new Common Core standards and accountability testing came demand for aligned curricular materials and teaching practices. Research indicates that the choice of instructional materials can have an impact "as large as or larger than the impact of teacher quality" (Chingos 2012). Massachusetts, along with Arkansas, Delaware, Kentucky, Louisiana, Maryland, Mississippi, Nebraska, New Mexico, Ohio, Rhode Island, Tennessee, and Texas, belongs to the Council of Chief State School Officers' (CCSSO) High Quality Instructional Materials and Professional Development network, which aims to close the "opportunity gap" among students by ensuring that every teacher has access to high-quality, standards-aligned instructional materials and receives relevant professional development to support their use (Chief State School Officers 2021).
All Massachusetts public school students must complete a high school science MCAS exam, providing a wealth of standardized data on students' discipline-specific skill development. All schools receive annual summary reports on student performance. Significant work has been done using MCAS achievement data and the Student Opportunity Act to identify achievement gaps and address funding inequities across the Commonwealth (Papy 2020). With the funding gaps outlined in the late 1990s closing, one could consider how MCAS data might be leveraged to support the state's current high-quality instructional materials initiatives. The state compiles each school's performance disaggregated by individual MCAS question item (DESE 2022).
Using the curricular information provided in statewide Next Generation MCAS High School Introductory Physics Item reports together with school-level student performance data, we hope to address the following broad questions:
Is there a relationship between differences in a school’s performance across Science Practice Categories and a school’s overall achievement on the Introductory Physics exam?
How can trends in a school’s performance be used to provide schools with guidance on discipline-specific curricular areas to target to improve student achievement?
In this report, I analyze the High School Introductory Physics Next Generation Massachusetts Comprehensive Assessment System (MCAS) test results for Massachusetts public schools.
Data for the study were drawn from DESE's Next Generation MCAS Test Achievement Results statewide report, Item Analysis statewide report, and the MCAS digital item library. The Next Generation High School Introductory Physics MCAS assessment consists of 42 multiple-choice and constructed-response items that assess students on Physical Science standards from the 2016 STE Massachusetts Curriculum Framework in the content Reporting Categories of Motion and Forces (MF), Energy (EN), and Waves (WA). Each item is associated with a specific content standard from the Massachusetts Curriculum Framework as well as an underlying science Practice Category of Evidence, Reasoning, and Modeling (ERM), Mathematics and Data (MD), or Investigations and Questioning (IQ). The State Item Report provides the percentage of points earned by students in a school for each item, as well as the percentage of points earned by all students in the state for each item.
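The Diff and SD Diff metrics used throughout this report follow directly from this item report layout. Below is a minimal sketch with made-up item rows and hypothetical column names (the real reports use different headers):

```r
# Each row of the state Item Report gives, for one item, the % of points
# earned by the school's students and the % earned statewide.
item_report <- data.frame(
  item       = c("Q1", "Q2", "Q3", "Q4"),
  category   = c("MD", "MD", "ERM", "ERM"),  # science Practice Category tag
  school_pct = c(62, 48, 71, 64),            # % points earned at the school
  state_pct  = c(55, 50, 68, 60)             # % points earned statewide
)
# Per-item school-vs-state gap
item_report$diff <- item_report$school_pct - item_report$state_pct
# Category-level summary: mean gap and its spread (the "Diff SD" measures)
aggregate(diff ~ category, data = item_report,
          FUN = function(x) c(mean = mean(x), sd = sd(x)))
```

A school with a large category-level SD of these gaps performs unevenly, relative to the state, across items within that category.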
The HSPhy_NextGen_SchoolSum data frame contains summary performance results on the Next Generation High School Introductory Physics MCAS, administered in the spring of 2022 and 2023, for 112 public schools across the Commonwealth. Of these, 87 schools tested students in both years and 25 tested students in only one of the two years; in total, 27,745 students completed the exam.
For each school, values are reported for 44 different variables drawn from three broad categories:
School Characteristics: the name of the school and the size of the school, School Size, as determined by the number of students who completed the MCAS exam.
Discipline-Specific Performance Metrics: the percentage of points earned by students at a school for items in each content Reporting Category (MF%, EN%, WA%) and science Practice Category (ERM%, MD%, IQ%); the difference between a school's percentage of points earned and the percentage earned by all students in the state (MFDiff, ENDiff, etc.); and the variability of a school's performance relative to the state within a category, as measured by the standard deviation of the school's item-level Diff values (MF Diff SD, EN Diff SD, etc.).
Aggregate Performance Level Metrics: a school's percentage of students at each of the four Performance Levels (E%: Exceeding Expectations, M%: Meeting Expectations, PM%: Partially Meeting Expectations, and NM%: Not Meeting Expectations); the difference between these percentages and the percentage of students statewide at each level (EDiff, MDiff, PMDiff, NMDiff); and an ordinal classification of schools, EM Perf Stat, based on the percentage of students classified as Exceeding or Meeting expectations on the exam (HighEM, HighM, Mid, Mid-Low, Low).
See the HSPhy_NextGenMCASDF data frame summary and codebook for further details about all variables.
A school's percentage of students classified as Exceeding expectations on the Introductory Physics MCAS is negatively associated with the school's variability in performance relative to the state on Mathematics and Data items, MD Diff SD.
A school’s summary performance on items in a given content Reporting Category as measured by MF%, EN%, and WA%, is positively associated with the Reporting Category's weight within the exam.
#HSPhy_NextGen_SchoolSum
HSPhy_NextGen_SchoolSum<-HSPhy_NextGen_SchoolSum%>%
ungroup()
#HSPhy_NextGen_SchoolSum
# HSPhy_NextGen_PerfDF
# HSPhy_NextGen_SchoolIT301DF
HSPhy_2023_SchoolSizeDF <- read_excel("data/2023_Physics_NextGenMCASItem.xlsx", skip = 1)%>%
  select(`School Name`, `School Code`, `Tested`)%>%
  mutate(`Tested` = as.integer(`Tested`))
HSPhy_2022_SchoolSizeDF <- read_excel("data/2022_Physics_NextGenMCASItem.xlsx", skip = 1)%>%
  select(`School Name`, `School Code`, `Tested`)%>%
  mutate(`Tested` = as.integer(`Tested`))
# Combine both administrations; average the tested count over the years a school appears
HSPhy_SchoolSize <- rbind(HSPhy_2023_SchoolSizeDF, HSPhy_2022_SchoolSizeDF)%>%
  group_by(`School Name`, `School Code`)%>%
  summarise(count = n(),
            `Tested` = sum(`Tested`))%>%
  mutate(`Tested Count` = round(`Tested`/count))%>%
  ungroup()
#HSPhy_SchoolSize
# Quartile cut points for tested-student counts (named to avoid masking stats::quantile)
size_quantiles <- quantile(HSPhy_SchoolSize$`Tested Count`)
HSPhy_Size <- HSPhy_SchoolSize%>%
  mutate(`School Size` = case_when(
    `Tested Count` <= size_quantiles[2] ~ "Small",
    `Tested Count` <= size_quantiles[3] ~ "Low-Mid",
    `Tested Count` <= size_quantiles[4] ~ "Upper-Mid",
    `Tested Count` <= size_quantiles[5] ~ "Large"
  ))%>%
  mutate(`School Size` = factor(`School Size`,
                                levels = c("Small", "Low-Mid", "Upper-Mid", "Large"),
                                ordered = TRUE))%>%
  select(`School Name`, `School Code`, `School Size`)
#HSPhy_Size
HSPhy_NextGen_SchoolSum <- HSPhy_NextGen_SchoolSum%>%
  left_join(HSPhy_Size, by = c("School Name", "School Code"))%>%
  mutate(`EMDiff` = `EDiff` + `MDiff`)%>%
  mutate(`EM Perf Stat` = case_when(
    `EDiff` > 0 & `EDiff` + `MDiff` > 0 ~ "HighEM",
    `EDiff` <= 0 & `EDiff` + `MDiff` > 0 ~ "HighM",
    `EMDiff` <= 0 & `EMDiff` > -14 ~ "Mid",
    `EMDiff` <= -14 & `EMDiff` >= -33 ~ "Mid-Low",
    `EMDiff` < -33 ~ "Low"
  ))%>%
  mutate(`EM Perf Stat` = factor(`EM Perf Stat`,
                                 levels = c("HighEM", "HighM", "Mid", "Mid-Low", "Low"),
                                 ordered = TRUE))
HSPhy_NextGen_SchoolSum

Data frame summary (frequency and distribution graphics from the original summarytools output are not reproduced here):

| Variable | Type | Distinct Values | Missing |
|---|---|---|---|
| Subject | character | 1 (PHY) | 0 (0.0%) |
| School Name | character | — | 0 (0.0%) |
| School Code | character | — | 0 (0.0%) |
| EN% | numeric | 50 | 0 (0.0%) |
| MF% | numeric | 48 | 0 (0.0%) |
| WA% | numeric | 47 | 0 (0.0%) |
| EN Diff SD | numeric | 108 | 0 (0.0%) |
| MF Diff SD | numeric | 106 | 0 (0.0%) |
| WA Diff SD | numeric | 106 | 0 (0.0%) |
| IQ% | numeric | 55 | 0 (0.0%) |
| MD% | numeric | 48 | 0 (0.0%) |
| ERM% | numeric | 49 | 0 (0.0%) |
| None% | numeric | 48 | 0 (0.0%) |
| IQ Diff SD | numeric | 66 | 10 (8.9%) |
| MD Diff SD | numeric | 107 | 0 (0.0%) |
| ERM Diff SD | numeric | 101 | 0 (0.0%) |
| None Diff SD | numeric | 107 | 0 (0.0%) |
| Tested Students | integer | 95 | 0 (0.0%) |
| E% | numeric | 29 | 0 (0.0%) |
| M% | numeric | 50 | 0 (0.0%) |
| PM% | numeric | 53 | 0 (0.0%) |
| NM% | numeric | 44 | 0 (0.0%) |
| E%State | numeric | 1 | 0 (0.0%) |
| M%State | numeric | 1 | 0 (0.0%) |
| PM%State | numeric | 1 | 0 (0.0%) |
| NM%State | numeric | 1 | 0 (0.0%) |
| EDiff | numeric | 29 | 0 (0.0%) |
| MDiff | numeric | 50 | 0 (0.0%) |
| PMDiff | numeric | 53 | 0 (0.0%) |
| NMDiff | numeric | 44 | 0 (0.0%) |
| EN%State | numeric | 1 | 0 (0.0%) |
| MF%State | numeric | 1 | 0 (0.0%) |
| WA%State | numeric | 1 | 0 (0.0%) |
| IQ%State | numeric | 1 | 0 (0.0%) |
| MD%State | numeric | 1 | 0 (0.0%) |
| ERM%State | numeric | 1 | 0 (0.0%) |
| None%State | numeric | 1 | 0 (0.0%) |
| School Size | ordered factor | 4 levels | 0 (0.0%) |
| EMDiff | numeric | 64 | 0 (0.0%) |
| EM Perf Stat | ordered factor | 5 levels | 0 (0.0%) |
Generated by summarytools 1.0.1 (R version 4.2.2)
2023-11-13
To explore the relationship between the distribution of schools' student Performance Levels and school performance within content categories, we examine the percentage of points earned by students at each school, as well as the standard deviation of the difference between points earned at a school and points earned statewide, across Reporting Categories and Practice Categories. We grouped schools by EM Perf Stat, an ordinal variable classifying schools by the percentage of their students classified as either Exceeding or Meeting expectations on the MCAS. These numbers suggest that items tagged with the science Practice Category of Mathematics and Data are more challenging to students than those tagged Evidence, Reasoning, and Modeling. The two practice categories are strongly and equally emphasized within the exam: items tagged with them account for 82% of the available points, with exactly 41% coming from each category.
When considering content Reporting Categories, performance patterns do not appear to differ across EM Perf Stat groups. Schools at all levels perform strongest on Motion and Forces items, next on Energy, and weakest on Waves items. Notably, this ordering matches the relative weights of the content areas within the exam; MF, EN, and WA items account for 50%, 30%, and 20% of exam points, respectively.
When examining the statewide performance distribution, we can see from the right-skew that it is rare for schools to have high percentages of students classified as Not Meeting expectations and even rarer for schools to have high percentages of students classified as Exceeding expectations.
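One quick numeric check of the direction of skew is to compare the mean and median: for a right-skewed distribution the mean exceeds the median. A sketch with made-up E% values:

```r
# Hypothetical E% values across schools: mostly small, with one large outlier
e_pct <- c(0, 0, 2, 3, 5, 8, 12, 40)
mean(e_pct)                  # pulled upward by the outlier
median(e_pct)                # resistant to the outlier
mean(e_pct) > median(e_pct)  # TRUE for right-skewed data
```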
HSPhy_NextGen_SchoolSum%>%
select(`E%`, `M%`, `PM%`, `NM%`)%>%
pivot_longer(c(1:4), names_to = "Performance Level", values_to = "% Students")%>%
ggplot( aes(x=`% Students`, color=`Performance Level`, fill=`Performance Level`)) +
geom_histogram(alpha=0.6, binwidth = 15) +
scale_fill_viridis(discrete=TRUE) +
scale_color_viridis(discrete=TRUE) +
#theme_ipsum() +
theme(
legend.position="none",
panel.spacing = unit(0.1, "lines"),
strip.text.x = element_text(size = 8)
) +
facet_wrap(~`Performance Level`)+
labs( y = "",
title = "School Performance Level Distribution",
x = "% Students at Performance Level",
caption = "NextGen HS Physics MCAS")

Although Mathematics and Data and Evidence, Reasoning, and Modeling items carry strong, equal weight on the HS Introductory Physics exam, student performance distributions differ noticeably across these practice categories.
HSPhy_NextGen_SchoolSum%>%
select(`ERM%`, `MD%`)%>%
pivot_longer(c(1:2), names_to = "Practice Cat", values_to = "% Points")%>%
ggplot( aes(x=`% Points`, color=`Practice Cat`, fill=`Practice Cat`)) +
geom_histogram(alpha=0.6, binwidth = 3) +
scale_fill_viridis(discrete=TRUE) +
scale_color_viridis(discrete=TRUE) +
#theme_ipsum() +
theme(
panel.spacing = unit(0.1, "lines"),
strip.text.x = element_text(size = 8)
) +
facet_wrap(~`Practice Cat`)+
labs( y = "",
title = "School Performance by Practice Category",
x = "% Points Earned",
caption = "NextGen HS Physics MCAS")

When considering the variability of a school's performance on items relative to the state by Practice Category (`MD Diff SD` and `ERM Diff SD`), we can see that the Mathematics and Data distribution is more right-skewed.
HSPhy_NextGen_SchoolSum%>%
select(`ERM Diff SD`, `MD Diff SD`)%>%
pivot_longer(c(1:2), names_to = "Practice Cat", values_to = "SD Diff")%>%
ggplot( aes(x=`SD Diff`, color=`Practice Cat`, fill=`Practice Cat`)) +
geom_histogram(alpha=0.6, binwidth = 3) +
scale_fill_viridis(discrete=TRUE) +
scale_color_viridis(discrete=TRUE) +
# theme_ipsum() +
theme(
panel.spacing = unit(0.1, "lines"),
strip.text.x = element_text(size = 8)
) +
labs( y = "",
title = "School Performance Variation by Practice Category",
x = "SD Diff",
caption = "NextGen HS Physics MCAS") +
facet_wrap(~`Practice Cat`)

These plots suggest that schools with the highest percentage of students classified as Exceeding expectations on the MCAS show the least variation in performance on Mathematics and Data items, while schools with the lowest percentage of such students show the most.
HSPhy_NextGen_SchoolSum%>%
select(`EM Perf Stat`, `ERM Diff SD`, `MD Diff SD` )%>%
pivot_longer(c(2:3), names_to = "Practice Cat", values_to = "SD Diff")%>%
ggplot( aes(x= `EM Perf Stat`, y=`SD Diff`, fill= `EM Perf Stat`)) +
geom_boxplot() +
scale_fill_viridis(discrete = TRUE, alpha=0.6) +
geom_jitter(color="black", size=0.4, alpha=0.9) +
theme(
plot.title = element_text(size=11),
axis.title.x=element_blank(),
#axis.text.x=element_blank()
) +
labs( y = "SD Diff",
title = "Student Performance Variation by Practice Category",
x = "",
caption = "NextGen HS Physics MCAS") +
facet_wrap(~`Practice Cat`)

HSPhy_NextGen_SchoolSum%>%
select(`EM Perf Stat`, `ERM Diff SD`, `MD Diff SD` )%>%
pivot_longer(c(2:3), names_to = "Practice Cat", values_to = "SD Diff")%>%
ggplot( aes(x= `Practice Cat`, y=`SD Diff`, fill= `Practice Cat`)) +
geom_boxplot() +
scale_fill_viridis(discrete = TRUE, alpha=0.6) +
geom_jitter(color="black", size=0.4, alpha=0.9) +
#theme_ipsum() +
theme(
plot.title = element_text(size=11),
axis.title.x=element_blank(),
axis.text.x=element_blank()
) +
labs( y = "SD Diff",
title = "Student Practice Cat. Variation by Achievement Level",
x = "",
caption = "NextGen HS Physics MCAS") +
#xlab("")+
facet_wrap(~`EM Perf Stat`)

These plots suggest that students at schools of all achievement levels have more difficulty with Mathematics and Data items than with Evidence, Reasoning, and Modeling items.
HSPhy_NextGen_SchoolSum%>%
select(`EM Perf Stat`, `ERM%`, `MD%` )%>%
pivot_longer(c(2:3), names_to = "Practice Cat", values_to = "%Points")%>%
ggplot( aes(x= `EM Perf Stat`, y=`%Points`, fill= `EM Perf Stat`)) +
geom_boxplot() +
scale_fill_viridis(discrete = TRUE, alpha=0.6) +
geom_jitter(color="black", size=0.4, alpha=0.9) +
#theme_ipsum() +
theme(
plot.title = element_text(size=11)
) +
labs( y = "%Points Earned",
title = "Student Practice Cat. Achievement by Performance Level",
x = "",
caption = "NextGen HS Physics MCAS") +
#xlab("")+
facet_wrap(~`Practice Cat`)

HSPhy_NextGen_SchoolSum%>%
select(`EM Perf Stat`, `ERM%`, `MD%` )%>%
pivot_longer(c(2:3), names_to = "Practice Cat", values_to = "%Points")%>%
ggplot( aes(x= `Practice Cat`, y=`%Points`, fill= `Practice Cat`)) +
geom_boxplot() +
scale_fill_viridis(discrete = TRUE, alpha=0.6) +
geom_jitter(color="black", size=0.4, alpha=0.9) +
#theme_ipsum() +
theme(
plot.title = element_text(size=11)
) +
labs( y = "%Points Earned",
title = "Student Practice Cat. Achievement by Performance Level",
x = "",
caption = "NextGen HS Physics MCAS") +
#xlab("")+
facet_wrap(~`EM Perf Stat`, scale = "free_y")
Here we can visualize the variability of a school's performance on items partitioned by the content Reporting Categories of Motion and Forces, Energy, and Waves via MF%/`MF Diff SD`, EN%/`EN Diff SD`, and WA%/`WA Diff SD`.
HSPhy_NextGen_SchoolSum%>%
select(`EM Perf Stat`, `MF Diff SD`, `EN Diff SD`, `WA Diff SD` )%>%
pivot_longer(c(2:4), names_to = "Report Cat", values_to = "SD Diff")%>%
ggplot( aes(x=`SD Diff`, color=`Report Cat`, fill=`Report Cat`)) +
geom_histogram(alpha=0.6, binwidth = 3) +
scale_fill_viridis(discrete=TRUE) +
scale_color_viridis(discrete=TRUE) +
#theme_ipsum() +
theme(
panel.spacing = unit(0.1, "lines"),
strip.text.x = element_text(size = 8)
) +
labs( y = "",
title = "School Performance Variation by Content Reporting Category",
x = "SD Diff",
caption = "NextGen HS Physics MCAS") +
facet_wrap(~`Report Cat`)

HSPhy_NextGen_SchoolSum%>%
select(`EM Perf Stat`, `MF%`, `EN%`, `WA%` )%>%
pivot_longer(c(2:4), names_to = "Report Cat", values_to = "% Points")%>%
ggplot( aes(x=`% Points`, color=`Report Cat`, fill=`Report Cat`)) +
geom_histogram(alpha=0.6, binwidth = 3) +
scale_fill_viridis(discrete=TRUE) +
scale_color_viridis(discrete=TRUE) +
#theme_ipsum() +
theme(
panel.spacing = unit(0.1, "lines"),
strip.text.x = element_text(size = 8)
) +
facet_wrap(~`Report Cat`)+
labs( y = "",
title = "Student Performance by Content Reporting Category",
x = "% Points Earned",
caption = "NextGen HS Physics MCAS")

These plots suggest that most schools exhibit similar variability in performance relative to the state across all content Reporting Categories. Schools with the lowest percentages of students Exceeding expectations show high variability in every category, though somewhat less on Waves items.
HSPhy_NextGen_SchoolSum%>%
select(`EM Perf Stat`, `MF Diff SD`, `EN Diff SD`, `WA Diff SD` )%>%
pivot_longer(c(2:4), names_to = "Report Cat", values_to = "SD Diff")%>%
ggplot( aes(x= `EM Perf Stat`, y=`SD Diff`, fill= `EM Perf Stat`)) +
geom_boxplot() +
scale_fill_viridis(discrete = TRUE, alpha=0.6) +
geom_jitter(color="black", size=0.4, alpha=0.9) +
theme(
plot.title = element_text(size=11),
axis.title.x=element_blank(),
axis.text.x=element_blank()
) +
labs( y = "SD Diff",
title = "School Performance Variation by Content Reporting Category",
x = "",
caption = "NextGen HS Physics MCAS") +
facet_wrap(~`Report Cat`)

HSPhy_NextGen_SchoolSum%>%
select(`EM Perf Stat`, `MF Diff SD`, `EN Diff SD`, `WA Diff SD` )%>%
pivot_longer(c(2:4), names_to = "Report Cat", values_to = "SD Diff")%>%
ggplot( aes(x= `Report Cat`, y=`SD Diff`, fill= `Report Cat`)) +
geom_boxplot() +
scale_fill_viridis(discrete = TRUE, alpha=0.6) +
geom_jitter(color="black", size=0.4, alpha=0.9) +
#theme_ipsum() +
theme(
plot.title = element_text(size=11),
axis.title.x=element_blank(),
axis.text.x=element_blank()
) +
labs( y = "SD Diff",
title = "School Content Reporting Cat. Variation by Achievement Level",
x = "",
caption = "NextGen HS Physics MCAS") +
#xlab("")+
facet_wrap(~`EM Perf Stat`)

HSPhy_NextGen_SchoolSum%>%
select(`EM Perf Stat`, `MF%`, `EN%`, `WA%` )%>%
pivot_longer(c(2:4), names_to = "Report Cat", values_to = "% Points")%>%
ggplot( aes(x= `Report Cat`, y=`% Points`, fill= `Report Cat`)) +
geom_boxplot() +
scale_fill_viridis(discrete = TRUE, alpha=0.6) +
geom_jitter(color="black", size=0.4, alpha=0.9) +
#theme_ipsum() +
theme(
plot.title = element_text(size=11),
axis.title.x=element_blank(),
axis.text.x=element_blank()
) +
labs( y = "Report Cat%",
title = "School Content Reporting Cat. Performance by Achievement Level",
x = "",
caption = "NextGen HS Physics MCAS") +
#xlab("")+
facet_wrap(~`EM Perf Stat`)
             Df Sum Sq Mean Sq F value Pr(>F)
`EM Perf Stat` 4 642.5 160.62 23.5 5.82e-14 ***
Residuals 107 731.2 6.83
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Df Sum Sq Mean Sq F value Pr(>F)
`EM Perf Stat` 4 311.8 77.95 13.33 7.58e-09 ***
Residuals 107 625.7 5.85
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Df Sum Sq Mean Sq F value Pr(>F)
`EM Perf Stat` 4 137.6 34.39 4.802 0.00133 **
Residuals 107 766.4 7.16
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Df Sum Sq Mean Sq F value Pr(>F)
`EM Perf Stat` 4 446.3 111.57 14.7 1.32e-09 ***
Residuals 107 811.8 7.59
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Df Sum Sq Mean Sq F value Pr(>F)
`EM Perf Stat` 4 447.2 111.79 19.59 4.02e-12 ***
Residuals 107 610.5 5.71
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
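The one-way ANOVA tables above test whether mean variability differs across the five EM Perf Stat groups; they come from calls of the form `summary(aov(\`MD Diff SD\` ~ \`EM Perf Stat\`, data = HSPhy_NextGen_SchoolSum))`. Since the output does not label which Diff SD column each table uses, the sketch below demonstrates the same call shape on toy data:

```r
set.seed(1)
# Toy analogue: three school groups with different mean variability
toy <- data.frame(
  grp        = factor(rep(c("HighEM", "Mid", "Low"), each = 10)),
  md_diff_sd = c(rnorm(10, mean = 3), rnorm(10, mean = 5), rnorm(10, mean = 8))
)
fit <- aov(md_diff_sd ~ grp, data = toy)
summary(fit)  # F test: do the group means differ?
```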
Call:
lm(formula = `EorM%` ~ (`MD Diff SD`), data = HSPhy_NextGen_SchoolSum)
Residuals:
Min 1Q Median 3Q Max
-32.575 -18.347 -4.778 15.525 81.212
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 73.1813 5.6876 12.867 < 2e-16 ***
`MD Diff SD` -3.9674 0.6127 -6.476 2.73e-09 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 22.71 on 110 degrees of freedom
Multiple R-squared: 0.276, Adjusted R-squared: 0.2694
F-statistic: 41.94 on 1 and 110 DF, p-value: 2.726e-09
Note that in the two-predictor model that follows, `MD Diff SD` remains statistically significant while `ERM Diff SD` does not.
Call:
lm(formula = `EorM%` ~ (`ERM Diff SD` + `MD Diff SD`), data = HSPhy_NextGen_SchoolSum)
Residuals:
Min 1Q Median 3Q Max
-32.499 -18.077 -4.584 16.254 81.167
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 74.0590 6.7612 10.953 < 2e-16 ***
`ERM Diff SD` -0.3279 1.3517 -0.243 0.80876
`MD Diff SD` -3.7413 1.1166 -3.351 0.00111 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 22.81 on 109 degrees of freedom
Multiple R-squared: 0.2764, Adjusted R-squared: 0.2631
F-statistic: 20.82 on 2 and 109 DF, p-value: 2.201e-08
Call:
lm(formula = `EorM%` ~ (`EN Diff SD`), data = HSPhy_NextGen_SchoolSum)
Residuals:
Min 1Q Median 3Q Max
-33.548 -18.356 -4.079 14.269 80.867
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 70.0957 6.1134 11.466 < 2e-16 ***
`EN Diff SD` -3.6402 0.6676 -5.453 3.07e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 23.68 on 110 degrees of freedom
Multiple R-squared: 0.2128, Adjusted R-squared: 0.2056
F-statistic: 29.74 on 1 and 110 DF, p-value: 3.073e-07
Call:
lm(formula = `EorM%` ~ (`MF Diff SD`), data = HSPhy_NextGen_SchoolSum)
Residuals:
Min 1Q Median 3Q Max
-31.602 -19.591 -3.146 13.321 82.578
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 78.6411 6.4338 12.223 < 2e-16 ***
`MF Diff SD` -4.5448 0.6968 -6.522 2.18e-09 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 22.66 on 110 degrees of freedom
Multiple R-squared: 0.2789, Adjusted R-squared: 0.2723
F-statistic: 42.54 on 1 and 110 DF, p-value: 2.182e-09
Call:
lm(formula = `EorM%` ~ (`WA Diff SD`), data = HSPhy_NextGen_SchoolSum)
Residuals:
Min 1Q Median 3Q Max
-46.473 -21.430 -1.154 16.703 79.358
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 62.9282 7.4207 8.480 1.15e-13 ***
`WA Diff SD` -2.8688 0.8444 -3.397 0.000948 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 25.39 on 110 degrees of freedom
Multiple R-squared: 0.09496, Adjusted R-squared: 0.08673
F-statistic: 11.54 on 1 and 110 DF, p-value: 0.0009481
Call:
lm(formula = `EorM%` ~ (`MF Diff SD`) + `EN Diff SD` + `MF Diff SD` *
`EN Diff SD`, data = HSPhy_NextGen_SchoolSum)
Residuals:
Min 1Q Median 3Q Max
-31.095 -19.080 -3.607 13.209 82.025
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 88.5460 15.4514 5.731 9.16e-08 ***
`MF Diff SD` -5.1276 1.9197 -2.671 0.00873 **
`EN Diff SD` -1.6326 2.1240 -0.769 0.44380
`MF Diff SD`:`EN Diff SD` 0.1096 0.1584 0.692 0.49069
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 22.81 on 108 degrees of freedom
Multiple R-squared: 0.2829, Adjusted R-squared: 0.2629
F-statistic: 14.2 on 3 and 108 DF, p-value: 7.241e-08
Call:
lm(formula = `EorM%` ~ (`MF Diff SD`) + `WA Diff SD` + `MF Diff SD` *
`WA Diff SD`, data = HSPhy_NextGen_SchoolSum)
Residuals:
Min 1Q Median 3Q Max
-30.132 -18.281 -4.656 13.137 78.167
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 91.2521 20.3203 4.491 1.79e-05 ***
`MF Diff SD` -7.6468 2.4499 -3.121 0.00231 **
`WA Diff SD` -0.2917 2.5515 -0.114 0.90919
`MF Diff SD`:`WA Diff SD` 0.2133 0.2450 0.871 0.38595
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 22.55 on 108 degrees of freedom
Multiple R-squared: 0.299, Adjusted R-squared: 0.2795
F-statistic: 15.35 on 3 and 108 DF, p-value: 2.181e-08
Call:
lm(formula = `EorM%` ~ (`MD Diff SD`) + `WA Diff SD` + `MD Diff SD` *
`WA Diff SD`, data = HSPhy_NextGen_SchoolSum)
Residuals:
Min 1Q Median 3Q Max
-33.160 -16.523 -5.964 14.101 69.660
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 104.1191 18.7929 5.540 2.15e-07 ***
`MD Diff SD` -10.0194 2.5033 -4.002 0.000115 ***
`WA Diff SD` -1.7848 2.1050 -0.848 0.398373
`MD Diff SD`:`WA Diff SD` 0.4548 0.2177 2.089 0.039048 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 22.18 on 108 degrees of freedom
Multiple R-squared: 0.3218, Adjusted R-squared: 0.303
F-statistic: 17.09 on 3 and 108 DF, p-value: 3.762e-09
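Since the `MD Diff SD` by `WA Diff SD` interaction above reaches significance, it is worth noting that a raw product term is often highly correlated with its component variables. A common remedy is mean-centering the predictors before forming the interaction. This is only an illustrative sketch on simulated data (the variable names are invented), not a re-fit of the models above:

```r
# Simulated sketch: centering predictors before fitting an interaction
# re-expresses the main effects at the predictor means, while leaving
# the interaction coefficient itself unchanged.
set.seed(603)
n  <- 112
x1 <- rnorm(n, mean = 8, sd = 2)
x2 <- rnorm(n, mean = 7, sd = 2)
y  <- 100 - 10 * x1 - 2 * x2 + 0.5 * x1 * x2 + rnorm(n, sd = 10)

fit_raw <- lm(y ~ x1 * x2)

x1c <- x1 - mean(x1)
x2c <- x2 - mean(x2)
fit_centered <- lm(y ~ x1c * x2c)

coef(fit_raw)["x1:x2"]         # interaction estimate on the raw scale
coef(fit_centered)["x1c:x2c"]  # identical estimate after centering
```

The two fits span the same column space, so only the interpretation of the main effects changes.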
Call:
lm(formula = `EorM%` ~ (`MD Diff SD`) + `MF Diff SD` + `MD Diff SD` *
`MF Diff SD`, data = HSPhy_NextGen_SchoolSum)
Residuals:
Min 1Q Median 3Q Max
-32.536 -16.951 -4.496 12.527 83.141
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 92.7357 14.3853 6.447 3.29e-09 ***
`MD Diff SD` -4.5226 2.6891 -1.682 0.0955 .
`MF Diff SD` -3.3337 1.9103 -1.745 0.0838 .
`MD Diff SD`:`MF Diff SD` 0.1680 0.1427 1.178 0.2415
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 22.57 on 108 degrees of freedom
Multiple R-squared: 0.298, Adjusted R-squared: 0.2785
F-statistic: 15.28 on 3 and 108 DF, p-value: 2.35e-08
Call:
lm(formula = `EorM%` ~ (`MD Diff SD`) + `EN Diff SD` + `MD Diff SD` *
`EN Diff SD`, data = HSPhy_NextGen_SchoolSum)
Residuals:
Min 1Q Median 3Q Max
-34.020 -17.378 -4.997 12.832 86.410
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 90.9327 13.8560 6.563 1.89e-09 ***
`MD Diff SD` -6.4081 2.1149 -3.030 0.00306 **
`EN Diff SD` -1.4283 1.7193 -0.831 0.40797
`MD Diff SD`:`EN Diff SD` 0.1842 0.1292 1.426 0.15687
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 22.7 on 108 degrees of freedom
Multiple R-squared: 0.2894, Adjusted R-squared: 0.2697
F-statistic: 14.66 on 3 and 108 DF, p-value: 4.462e-08
Call:
lm(formula = `EorM%` ~ `MD%` + `ERM%` + `MD%` * `ERM%`, data = HSPhy_NextGen_SchoolSum)
Residuals:
Min 1Q Median 3Q Max
-18.9971 -2.5815 0.1229 2.3757 16.8436
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -31.643553 5.538802 -5.713 9.92e-08 ***
`MD%` 0.801079 0.206932 3.871 0.000186 ***
`ERM%` 0.148625 0.203262 0.731 0.466240
`MD%`:`ERM%` 0.009329 0.002118 4.404 2.51e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 4.473 on 108 degrees of freedom
Multiple R-squared: 0.9724, Adjusted R-squared: 0.9717
F-statistic: 1269 on 3 and 108 DF, p-value: < 2.2e-16
Call:
lm(formula = `EorM%` ~ `MD%` + `WA%` + `MD%` * `WA%`, data = HSPhy_NextGen_SchoolSum)
Residuals:
Min 1Q Median 3Q Max
-17.1752 -3.0657 -0.1061 2.2054 19.8722
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -35.163765 4.730111 -7.434 2.63e-11 ***
`MD%` 1.290583 0.161619 7.985 1.63e-12 ***
`WA%` -0.015233 0.205137 -0.074 0.94094
`MD%`:`WA%` 0.006053 0.002141 2.828 0.00559 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 4.729 on 108 degrees of freedom
Multiple R-squared: 0.9692, Adjusted R-squared: 0.9683
F-statistic: 1132 on 3 and 108 DF, p-value: < 2.2e-16
Call:
lm(formula = `EorM%` ~ `ERM%` + `WA%` + `ERM%` * `WA%`, data = HSPhy_NextGen_SchoolSum)
Residuals:
Min 1Q Median 3Q Max
-16.8575 -3.3910 0.0816 2.9314 13.5817
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -46.744066 6.471302 -7.223 7.53e-11 ***
`ERM%` 0.993748 0.175196 5.672 1.19e-07 ***
`WA%` 0.315195 0.227940 1.383 0.170
`ERM%`:`WA%` 0.007465 0.002715 2.749 0.007 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 5.197 on 108 degrees of freedom
Multiple R-squared: 0.9628, Adjusted R-squared: 0.9617
F-statistic: 931 on 3 and 108 DF, p-value: < 2.2e-16
Call:
lm(formula = `EorM%` ~ `ERM%` + `MD%` + `MF%` + `MD%` * `ERM%` +
`MD%` * `MF%` + `ERM%` * `MF%`, data = HSPhy_NextGen_SchoolSum)
Residuals:
Min 1Q Median 3Q Max
-14.4908 -2.3554 0.2093 1.9293 14.2689
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -19.38914 9.12900 -2.124 0.036026 *
`ERM%` 1.37435 0.84576 1.625 0.107162
`MD%` 4.55083 0.89029 5.112 1.44e-06 ***
`MF%` -5.12934 1.11268 -4.610 1.14e-05 ***
`ERM%`:`MD%` -0.07223 0.02178 -3.317 0.001251 **
`MD%`:`MF%` 0.01931 0.01667 1.158 0.249291
`ERM%`:`MF%` 0.06161 0.01704 3.616 0.000461 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 4.083 on 105 degrees of freedom
Multiple R-squared: 0.9777, Adjusted R-squared: 0.9764
F-statistic: 765.9 on 6 and 105 DF, p-value: < 2.2e-16
Call:
lm(formula = `EorM%` ~ `ERM%` + `MD%` + `WA%` + `MD%` * `ERM%` +
`MD%` * `WA%` + `ERM%` * `WA%`, data = HSPhy_NextGen_SchoolSum)
Residuals:
Min 1Q Median 3Q Max
-15.1584 -2.1387 0.2441 2.1956 14.8754
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -27.02886 8.91137 -3.033 0.00305 **
`ERM%` -0.79907 0.62833 -1.272 0.20628
`MD%` -0.73390 0.89506 -0.820 0.41410
`WA%` 2.44743 0.76083 3.217 0.00172 **
`ERM%`:`MD%` 0.05509 0.01285 4.285 4.06e-05 ***
`MD%`:`WA%` -0.02285 0.01265 -1.806 0.07385 .
`ERM%`:`WA%` -0.02437 0.02027 -1.202 0.23192
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 4.187 on 105 degrees of freedom
Multiple R-squared: 0.9765, Adjusted R-squared: 0.9752
F-statistic: 727.3 on 6 and 105 DF, p-value: < 2.2e-16
(Intercept) `MD Diff SD` `WA Diff SD`
104.119115 -10.019434 -1.784838
`MD Diff SD`:`WA Diff SD`
0.454809
#fit_erm = lm(`E%` ~ `ERM Diff SD`, data = HSPhy_NextGen_SchoolSum)
#summary(fit_erm)
#fit_erm_md = lm(`E%` ~ log(`MD Diff SD`) + log(`ERM Diff SD`) + log(`MD Diff SD`)*log(`ERM Diff SD`), data = HSPhy_NextGen_SchoolSum)
#summary(fit_erm_md)
#fit_md_percent = lm(`E%` ~ log(`MD%`) + log(`ERM%`) + log(`MD%`)*log(`ERM%`), data = HSPhy_NextGen_SchoolSum)
#summary(fit_md_percent)
HSPhy_NextGen_SchoolSum%>%
select(`MD%`, `E%`)
Call:
lm(formula = (`E%`) ~ log(`WA%`), data = HSPhy_NextGen_SchoolSum)
Residuals:
Min 1Q Median 3Q Max
-10.575 -5.428 -2.218 4.028 33.155
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -104.236 9.087 -11.47 <2e-16 ***
log(`WA%`) 29.999 2.408 12.46 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 7.761 on 110 degrees of freedom
Multiple R-squared: 0.5853, Adjusted R-squared: 0.5816
F-statistic: 155.3 on 1 and 110 DF, p-value: < 2.2e-16
Call:
lm(formula = (`E%`) ~ log(`MD%`), data = HSPhy_NextGen_SchoolSum)
Residuals:
Min 1Q Median 3Q Max
-11.069 -5.617 -1.981 3.093 35.895
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -93.347 8.951 -10.43 <2e-16 ***
log(`MD%`) 26.803 2.344 11.43 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 8.147 on 110 degrees of freedom
Multiple R-squared: 0.5431, Adjusted R-squared: 0.5389
F-statistic: 130.7 on 1 and 110 DF, p-value: < 2.2e-16
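The log-predictor models above can be compared against their linear-predictor counterparts with AIC, since all of them share the same response variable. A sketch on simulated data (the generating curve and names below are invented stand-ins, not the MCAS results):

```r
# Simulated sketch: comparing a linear and a logarithmic predictor
# specification for the same response via AIC.
set.seed(603)
n <- 112
p <- runif(n, min = 20, max = 95)          # stand-in for `WA%`
e <- 30 * log(p) - 104 + rnorm(n, sd = 8)  # stand-in for `E%`, log-shaped by construction

fit_lin <- lm(e ~ p)
fit_log <- lm(e ~ log(p))

AIC(fit_lin, fit_log)  # lower AIC indicates the better-fitting form
```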
ggplot(data = HSPhy_NextGen_SchoolSum, aes(x = log(`WA%`), y = log(`E%`))) +
geom_point() +
geom_smooth(method="lm", se=T)

---
title: "DACSS603Final"
author: "Theresa Szczepanski"
description: "MCAS G9 Science Analysis"
date: "10/22/2023"
format:
  html:
    embed-resources: true
    self-contained-math: true
    df-print: paged
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
bibliography: references.bib
editor:
  markdown:
    wrap: 72
---
```{r}
#| label: setup
#| warning: false
#| message: false
source('dependencies.R')
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
#install.packages("stargazer")
library(stargazer)
```
# Research Questions
The Massachusetts Education Reform Act of 1993 was passed in the context
of a national movement toward education reform in the United States. As
early as 1989, there were calls to establish national curriculum
standards as a way to improve students' college and career readiness and
to close poverty gaps [@Greer18]. The Massachusetts Comprehensive
Assessment System (MCAS) tests were introduced as part of that act.
The MCAS tests are a significant tool for educational equity. Scores on
the Grade 10 Math MCAS test "predict longer-term educational attainments
and labor market success, above and beyond typical markers of student
advantage," and differences among students are largely and "sometimes
completely accounted for" by differences in 10th-grade MCAS scores and
educational attainments [@Boats20].
With the introduction of the new Common Core standards and accountability
testing came demand for aligned curricular materials and teaching
practices. Research indicates that the choice of instructional materials
can have an impact "as large as or larger than the impact of teacher
quality" [@Blindly12]. Massachusetts, along with Arkansas, Delaware,
Kentucky, Louisiana, Maryland, Mississippi, Nebraska, New Mexico, Ohio,
Rhode Island, Tennessee, and Texas, belongs to the Council of Chief State
School Officers' (CCSSO) [High Quality Instructional Materials and
Professional Development
network](https://learning.ccsso.org/high-quality-instructional-materials),
which aims to close the "opportunity gap" among students by ensuring that
every teacher has access to high-quality, standards-aligned
instructional materials and receives relevant professional development
to support their use of these materials [@IMPD21].
All Massachusetts public school students must complete a high school science MCAS
exam, providing a wealth of standardized data on students' discipline-specific skill development.
All schools receive annual summary reports on student performance. Significant work has been
done using MCAS achievement data and the Student Opportunity Act to identify achievement gaps and address funding inequities across the Commonwealth [@Boats20]. With the funding gaps outlined in the late 1990s
closing, one could consider how the MCAS data could be leveraged to support the state's
current high-quality instructional materials initiatives. The state compiles each school's performance
disaggregated by MCAS question item [@MCASIT].
Using the curricular information provided in statewide Next Generation MCAS High School Introductory Physics
Item reports, together with school-level student performance data, we hope to address the following broad
questions:
```{=html}
<style>
div.blue { background-color:#e6f0ff; border-radius: 5px; padding: 20px;}
</style>
```
::: blue
- Is there a relationship between differences in a school's performance
across Science Practice Categories and a school's overall achievement on the Introductory Physics exam?
- How can trends in a school's performance be used to provide schools with guidance on
discipline-specific curricular areas to target to improve student achievement?
:::
In this report, I will analyze the High School Introductory Physics Next
Generation [Massachusetts Comprehensive Assessment System
(MCAS)](https://www.doe.mass.edu/mcas/default.html) test results for
Massachusetts public schools.
Data for the study were drawn from DESE’s Next Generation MCAS Test
[Achievement Results statewide report](https://profiles.doe.mass.edu/statereport/mcas.aspx),
[Item Analysis statewide report](https://profiles.doe.mass.edu/statereport/nextgenmcas_item.aspx), and
the [MCAS digital item library](https://mcas.digitalitemlibrary.com/home?subject=Science&grades=Physics&view=ALL). The Next Generation High School Introductory Physics MCAS assessment consists of 42 multiple choice and constructed response items that assess students on Physical Science standards from the [2016 STE Massachusetts
Curriculum Framework](https://www.doe.mass.edu/frameworks/scitech/2016-04.pdf)
in the content `Reporting Categories` of Motion and Forces, `MF`,
Energy, `EN`, and Waves, `WA`. Each item is associated with a specific content standard from the Massachusetts Curriculum Framework as well as an underlying science `Practice Category` of
Evidence Reasoning and Modeling, `ERM`, Mathematics and Data, `MD`, or Investigations
and Questioning, `IQ`. The State Item Report provides the percentage of points earned by
students in a school for each item as well as the percentage of points earned by
all students in the state for each item.
The `HSPhy_NextGen_SchoolSum` data frame contains summary performance results for
112 public schools across the Commonwealth on the Next Generation High
School Introductory Physics MCAS, which was administered in the spring
of 2022 and 2023. Of these, 87 schools tested students in both years and
25 tested students in only one of the two years; in total, 27,745
students completed the exam.
For each school, values are reported for 44 variables drawn from three
broad categories:
- *School Characteristics*: This includes the name of the school and
the size of the school, `School Size`, as determined by the number of students that
completed the MCAS exam.
- *Discipline-Specific Performance Metrics*: This includes the percentage of
points earned by students at a school on items in each content `Reporting
Category` (`MF%`, `EN%`, `WA%`) and science `Practice Category` (`ERM%`,
`MD%`, `IQ%`), the difference between a school's percentage of points
earned and the percentage earned by all students in the state (`MFDiff`, `ENDiff`, etc.), and the variability of a school's performance relative to the state within each category, as
measured by the standard deviation of the school's item-level `Diff` values (`SD MF Diff`, `SD EN Diff`, etc.).
- *Aggregate Performance Level Metrics*: This includes a school's percentage
of students at each of the four `Performance Levels` (`E%`: Exceeding
Expectations, `M%`: Meeting Expectations, `PM%`: Partially Meeting
Expectations, and `NM%`: Not Meeting Expectations),
the difference between these percentages and the corresponding statewide
percentages (`EDiff`, `MDiff`, `PMDiff`, `NMDiff`),
and an ordinal classification of schools, `EM Perf Stat`, based on the percentage of students
classified as Exceeding or Meeting Expectations on the exam (`HighEM`, `HighM`, `Mid`, `Mid-Low`, `Low`).
See the `HSPhy_NextGenMCASDF` data frame summary and **codebook** for
further details about all variables.
# Hypothesis
```{=html}
<style>
div.blue { background-color:#e6f0ff; border-radius: 5px; padding: 20px;}
</style>
```
::: blue
- A school's percentage of students classified as `Exceeding` expectations on the
Introductory Physics MCAS is negatively associated with the variability of its
performance, relative to the state, on `Mathematics and Data` items (`SD MD Diff`).
- A school's summary performance on items in a given content `Reporting Category`, as measured by `MF%`, `EN%`, and `WA%`, is positively associated with that category's weight within the exam.
:::
# Descriptive Statistics
```{r}
#| label: dataframe setup
#| warning: false
#| message: false
#HSPhy_NextGen_SchoolSum
HSPhy_NextGen_SchoolSum<-HSPhy_NextGen_SchoolSum%>%
ungroup()
#HSPhy_NextGen_SchoolSum
# HSPhy_NextGen_PerfDF
# HSPhy_NextGen_SchoolIT301DF
HSPhy_2023_SchoolSizeDF<-read_excel("data/2023_Physics_NextGenMCASItem.xlsx", skip = 1)%>%
select(`School Name`, `School Code`, `Tested`)%>%
mutate(`Tested` = as.integer(`Tested`))
HSPhy_2022_SchoolSizeDF<-read_excel("data/2022_Physics_NextGenMCASItem.xlsx", skip = 1)%>%
select(`School Name`, `School Code`, `Tested`)%>%
mutate(`Tested` = as.integer(`Tested`))
HSPhy_SchoolSize <- rbind(HSPhy_2023_SchoolSizeDF, HSPhy_2022_SchoolSizeDF)%>%
mutate(count = 1)%>%
group_by(`School Name`, `School Code`)%>%
summarise(count = sum(count),
`Tested` = sum(`Tested`))%>%
mutate(`Tested Count` = round(`Tested`/count))%>%
ungroup()
#HSPhy_SchoolSize
# Quartile cut points for school size; named to avoid shadowing stats::quantile()
size_quantiles <- quantile(HSPhy_SchoolSize$`Tested Count`)
HSPhy_Size<-HSPhy_SchoolSize%>%
mutate(`School Size` = case_when(
`Tested Count` <= size_quantiles[2] ~ "Small",
`Tested Count` > size_quantiles[2] &
`Tested Count` <= size_quantiles[3] ~ "Low-Mid",
`Tested Count` > size_quantiles[3] &
`Tested Count` <= size_quantiles[4] ~ "Upper-Mid",
`Tested Count` > size_quantiles[4] &
`Tested Count` <= size_quantiles[5] ~ "Large"
))%>%
mutate(`School Size` = recode_factor(`School Size`,
"Small" = "Small",
"Low-Mid" = "Low-Mid",
"Upper-Mid" = "Upper-Mid",
"Large" = "Large",
.ordered = TRUE))%>%
select(`School Name`, `School Code`, `School Size`)
#HSPhy_Size
HSPhy_NextGen_SchoolSum<-HSPhy_NextGen_SchoolSum%>%
left_join(HSPhy_Size, by = c("School Name" = "School Name", "School Code" = "School Code"))%>%
mutate(`EMDiff` = `EDiff` + `MDiff`)%>%
mutate(`EM Perf Stat` = case_when(
`EDiff` > 0 & `EDiff` + `MDiff` > 0 ~ "HighEM",
`EDiff` <= 0 & `EDiff` + `MDiff` > 0 ~ "HighM",
#`EMDiff` > quantile(HSPhy_NextGen_SchoolSum$`EMDiff`)[3] &
`EMDiff` <= 0 & `EMDiff` > -14 ~ "Mid",
`EMDiff` <= -14 & `EMDiff` >= -33 ~ "Mid-Low",
`EMDiff` < -33 ~ "Low"
))%>%
mutate(`EM Perf Stat` = recode_factor(`EM Perf Stat`,
"HighEM" = "HighEM",
"HighM" = "HighM",
"Mid" = "Mid",
"Mid-Low" = "Mid-Low",
"Low" = "Low",
.ordered = TRUE))
HSPhy_NextGen_SchoolSum
#quantile(HSPhy_NextGen_SchoolSum$`EMDiff`)
#summary(HSPhy_NextGen_SchoolSum)
print(summarytools::dfSummary(HSPhy_NextGen_SchoolSum,
varnumbers = FALSE,
plain.ascii = FALSE,
style = "grid",
graph.magnif = 0.70,
valid.col = FALSE),
method = 'render',
table.classes = 'table-condensed')
```
## Key Variables
To explore the relationship between the distribution of a school's student `Performance Levels`
and its performance in content categories, we examine the percentage of points
earned by students at each school, as well as the standard deviation of the difference between points earned at a school and points earned statewide, across `Reporting Categories` and `Practice Categories`. We grouped schools by `EM Perf Stat`, an ordinal variable classifying schools
by the percentage of their students who were classified as either Exceeding or
Meeting expectations on the MCAS. These numbers suggest that items tagged
with the science `Practice Category` of `Mathematics and Data` are more challenging for students than those
tagged `Evidence, Reasoning, and Modeling`. Both practice categories are strongly and equally emphasized within the exam: items tagged with them account for **82%** of the available points, with exactly **41%** of available points coming from each category.

When considering content `Reporting Categories`, there do not appear to be discernible distinctions in how `EM Perf Stat` groups perform
across categories. All schools appear to perform strongest on `Motion and Forces` items, followed by `Energy`, and weakest on `Waves` items. Notably, this matches the relative weights of the content areas within the exam: `MF`, `EN`, and `WA` items account for **50%**, **30%**, and **20%** of exam points, respectively.
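The weighting claim implies a simple arithmetic relationship: a school's overall percentage of points is approximately the weighted mean of its category percentages. A quick check with hypothetical numbers (the category percentages below are invented; only the 50/30/20 weights come from the exam design):

```r
# Hypothetical school: overall % of points as the weighted mean of the
# content Reporting Category percentages, using the exam's 50/30/20 weights.
weights <- c(MF = 0.50, EN = 0.30, WA = 0.20)
cat_pct <- c(MF = 62, EN = 55, WA = 48)  # invented category percentages
overall <- sum(weights * cat_pct)
overall  # 62*0.5 + 55*0.3 + 48*0.2 = 57.1
```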
```{r}
#quantile(HSPhy_NextGen_SchoolSum$`EMDiff`)
HSPhy_NextGen_SchoolSum%>%
group_by(`EM Perf Stat`)%>%
summarise( `Mean MD%` = mean(`MD%`),
`Mean MD SD` = mean(`MD Diff SD`),
`Mean ERM%` = mean(`ERM%`),
`Mean ERM SD` = mean (`ERM Diff SD`))
HSPhy_NextGen_SchoolSum%>%
group_by(`EM Perf Stat`)%>%
summarise( `Mean MF%` = mean(`MF%`),
`Mean MF SD` = mean(`MF Diff SD`),
`Mean EN%` = mean(`EN%`),
`Mean EN SD` = mean (`EN Diff SD`),
`Mean WA%` = mean(`WA%`),
`Mean WA SD` = mean (`WA Diff SD`)
)
```
# Visualization
## Distribution of Performance Level %
When examining the statewide performance distribution, we can see from the right-skew
that it is rare for schools to have high percentages of students classified as `Not Meeting` expectations
and even rarer for schools to have high percentages of students classified as `Exceeding` expectations.
```{r}
HSPhy_NextGen_SchoolSum%>%
select(`E%`, `M%`, `PM%`, `NM%`)%>%
pivot_longer(c(1:4), names_to = "Performance Level", values_to = "% Students")%>%
ggplot( aes(x=`% Students`, color=`Performance Level`, fill=`Performance Level`)) +
geom_histogram(alpha=0.6, binwidth = 15) +
scale_fill_viridis(discrete=TRUE) +
scale_color_viridis(discrete=TRUE) +
#theme_ipsum() +
theme(
legend.position="none",
panel.spacing = unit(0.1, "lines"),
strip.text.x = element_text(size = 8)
) +
facet_wrap(~`Performance Level`)+
labs( y = "",
title = "School Performance Level Distribution",
x = "% Students at Performance Level",
caption = "NextGen HS Physics MCAS")
```
## Distribution of School Performance and Variability by Practice Cat
Although `Mathematics and Data` and `Evidence, Reasoning, and Modeling` items
have strong and equal weighting in the HS Introductory Physics exam, student performance distributions
are noticeably different across these practice categories.
```{r}
HSPhy_NextGen_SchoolSum%>%
select(`ERM%`, `MD%`)%>%
pivot_longer(c(1:2), names_to = "Practice Cat", values_to = "% Points")%>%
ggplot( aes(x=`% Points`, color=`Practice Cat`, fill=`Practice Cat`)) +
geom_histogram(alpha=0.6, binwidth = 3) +
scale_fill_viridis(discrete=TRUE) +
scale_color_viridis(discrete=TRUE) +
#theme_ipsum() +
theme(
panel.spacing = unit(0.1, "lines"),
strip.text.x = element_text(size = 8)
) +
facet_wrap(~`Practice Cat`)+
labs( y = "",
title = "School Performance by Practice Category",
x = "% Points Earned",
caption = "NextGen HS Physics MCAS")
#ggtitle("Practice Category Performance")
```
When considering the variability of a school's performance, relative
to the state, by `Practice Category` (`SD MD Diff` and `SD ERM Diff`), we can
see that the `Mathematics and Data` distribution is more right-skewed.
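The right-skew observation can be quantified with a sample skewness coefficient, which base R does not provide directly. A self-contained sketch, computed here on simulated stand-ins rather than on the `Diff SD` columns themselves:

```r
# Adjusted Fisher-Pearson sample skewness, written without extra packages.
skewness <- function(x) {
  n <- length(x)
  z <- (x - mean(x)) / sd(x)
  (n / ((n - 1) * (n - 2))) * sum(z^3)
}

set.seed(603)
skewness(rnorm(200))  # roughly symmetric sample: near 0
skewness(rexp(200))   # right-skewed sample: clearly positive
```

The same function could be applied to each `Diff SD` column to put a number on the visual comparison.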
```{r}
HSPhy_NextGen_SchoolSum%>%
select(`ERM Diff SD`, `MD Diff SD`)%>%
pivot_longer(c(1:2), names_to = "Practice Cat", values_to = "SD Diff")%>%
ggplot( aes(x=`SD Diff`, color=`Practice Cat`, fill=`Practice Cat`)) +
geom_histogram(alpha=0.6, binwidth = 3) +
scale_fill_viridis(discrete=TRUE) +
scale_color_viridis(discrete=TRUE) +
# theme_ipsum() +
theme(
panel.spacing = unit(0.1, "lines"),
strip.text.x = element_text(size = 8)
) +
labs( y = "",
title = "School Performance Variation by Practice Category",
x = "SD Diff",
caption = "NextGen HS Physics MCAS") +
facet_wrap(~`Practice Cat`)
```
## Mathematics and Data vs. Evidence Reasoning and Modeling (Practice Category)
These plots suggest that schools with the **highest** percentage of students classified as `Exceeding`
expectations on the MCAS show the **lowest** variation in performance on
`Mathematics and Data` items, while schools with the **lowest** percentage of `Exceeding`
students show the **highest** variation on those items.
```{r}
HSPhy_NextGen_SchoolSum%>%
select(`EM Perf Stat`, `ERM Diff SD`, `MD Diff SD` )%>%
pivot_longer(c(2:3), names_to = "Practice Cat", values_to = "SD Diff")%>%
ggplot( aes(x= `EM Perf Stat`, y=`SD Diff`, fill= `EM Perf Stat`)) +
geom_boxplot() +
scale_fill_viridis(discrete = TRUE, alpha=0.6) +
geom_jitter(color="black", size=0.4, alpha=0.9) +
theme(
plot.title = element_text(size=11),
axis.title.x=element_blank(),
#axis.text.x=element_blank()
) +
labs( y = "SD Diff",
title = "Student Performance Variation by Practice Category",
x = "",
caption = "NextGen HS Physics MCAS") +
facet_wrap(~`Practice Cat`)
HSPhy_NextGen_SchoolSum%>%
select(`EM Perf Stat`, `ERM Diff SD`, `MD Diff SD` )%>%
pivot_longer(c(2:3), names_to = "Practice Cat", values_to = "SD Diff")%>%
ggplot( aes(x= `Practice Cat`, y=`SD Diff`, fill= `Practice Cat`)) +
geom_boxplot() +
scale_fill_viridis(discrete = TRUE, alpha=0.6) +
geom_jitter(color="black", size=0.4, alpha=0.9) +
#theme_ipsum() +
theme(
plot.title = element_text(size=11),
axis.title.x=element_blank(),
axis.text.x=element_blank()
) +
labs( y = "SD Diff",
title = "Student Practice Cat. Variation by Achievement Level",
x = "",
caption = "NextGen HS Physics MCAS") +
#xlab("")+
facet_wrap(~`EM Perf Stat`)
```
These plots suggest that students at schools across all performance levels have
more difficulty with `Mathematics and Data` items than with
`Evidence, Reasoning, and Modeling` items.
```{r}
HSPhy_NextGen_SchoolSum%>%
select(`EM Perf Stat`, `ERM%`, `MD%` )%>%
pivot_longer(c(2:3), names_to = "Practice Cat", values_to = "%Points")%>%
ggplot( aes(x= `EM Perf Stat`, y=`%Points`, fill= `EM Perf Stat`)) +
geom_boxplot() +
scale_fill_viridis(discrete = TRUE, alpha=0.6) +
geom_jitter(color="black", size=0.4, alpha=0.9) +
#theme_ipsum() +
theme(
plot.title = element_text(size=11)
) +
labs( y = "%Points Earned",
title = "Student Practice Cat. Achievement by Performance Level",
x = "",
caption = "NextGen HS Physics MCAS") +
#xlab("")+
facet_wrap(~`Practice Cat`)
```
```{r}
HSPhy_NextGen_SchoolSum%>%
select(`EM Perf Stat`, `ERM%`, `MD%` )%>%
pivot_longer(c(2:3), names_to = "Practice Cat", values_to = "%Points")%>%
ggplot( aes(x= `Practice Cat`, y=`%Points`, fill= `Practice Cat`)) +
geom_boxplot() +
scale_fill_viridis(discrete = TRUE, alpha=0.6) +
geom_jitter(color="black", size=0.4, alpha=0.9) +
#theme_ipsum() +
theme(
plot.title = element_text(size=11)
) +
labs( y = "%Points Earned",
title = "Student Practice Cat. Achievement by Performance Level",
x = "",
caption = "NextGen HS Physics MCAS") +
#xlab("")+
facet_wrap(~`EM Perf Stat`, scales ="free_y")
```
## Distribution of School Performance and Variability by Reporting Cat
Here we can visualize the variability of a school's performance on items partitioned
by Content `Reporting Category` of `Motion and Forces`, `Energy`, and `Waves` via:
`MF%`/`SD MF Diff`, `EN%`/`SD EN Diff`, and `WA%`/`SD WA Diff`.
```{r}
HSPhy_NextGen_SchoolSum%>%
select(`EM Perf Stat`, `MF Diff SD`, `EN Diff SD`, `WA Diff SD` )%>%
pivot_longer(c(2:4), names_to = "Report Cat", values_to = "SD Diff")%>%
ggplot( aes(x=`SD Diff`, color=`Report Cat`, fill=`Report Cat`)) +
geom_histogram(alpha=0.6, binwidth = 3) +
scale_fill_viridis(discrete=TRUE) +
scale_color_viridis(discrete=TRUE) +
#theme_ipsum() +
theme(
panel.spacing = unit(0.1, "lines"),
strip.text.x = element_text(size = 8)
) +
labs( y = "",
title = "School Performance Variation by Content Reporting Category",
x = "SD Diff",
caption = "NextGen HS Physics MCAS") +
facet_wrap(~`Report Cat`)
```
```{r}
HSPhy_NextGen_SchoolSum%>%
select(`EM Perf Stat`, `MF%`, `EN%`, `WA%` )%>%
pivot_longer(c(2:4), names_to = "Report Cat", values_to = "% Points")%>%
ggplot( aes(x=`% Points`, color=`Report Cat`, fill=`Report Cat`)) +
geom_histogram(alpha=0.6, binwidth = 3) +
scale_fill_viridis(discrete=TRUE) +
scale_color_viridis(discrete=TRUE) +
#theme_ipsum() +
theme(
panel.spacing = unit(0.1, "lines"),
strip.text.x = element_text(size = 8)
) +
facet_wrap(~`Report Cat`)+
labs( y = "",
title = "Student Performance by Content Reporting Category",
x = "% Points Earned",
caption = "NextGen HS Physics MCAS")
```
## Motion and Forces vs. Energy vs. Waves (Reporting Category)
These plots suggest that most schools exhibit similar levels of variability in
performance relative to the state across all content reporting categories. Schools
with the ***lowest percentage*** of students `Exceeding` expectations show
***high variability*** across all reporting categories, though their variability
appears somewhat lower on `Waves` items.
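One way to check this impression numerically is to tabulate a typical (median) `SD Diff` per reporting category within each achievement group. A minimal sketch, assuming the same columns used in the plots in this section; the median is chosen here only as a robust summary, not as the analysis of record:

```{r}
# Median SD Diff by achievement group and content reporting category
HSPhy_NextGen_SchoolSum %>%
  select(`EM Perf Stat`, `MF Diff SD`, `EN Diff SD`, `WA Diff SD`) %>%
  pivot_longer(2:4, names_to = "Report Cat", values_to = "SD Diff") %>%
  group_by(`EM Perf Stat`, `Report Cat`) %>%
  summarize(`Median SD Diff` = median(`SD Diff`, na.rm = TRUE), .groups = "drop")
```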
```{r}
HSPhy_NextGen_SchoolSum%>%
select(`EM Perf Stat`, `MF Diff SD`, `EN Diff SD`, `WA Diff SD` )%>%
pivot_longer(c(2:4), names_to = "Report Cat", values_to = "SD Diff")%>%
ggplot( aes(x= `EM Perf Stat`, y=`SD Diff`, fill= `EM Perf Stat`)) +
geom_boxplot() +
scale_fill_viridis(discrete = TRUE, alpha=0.6) +
geom_jitter(color="black", size=0.4, alpha=0.9) +
theme(
plot.title = element_text(size=11),
axis.title.x=element_blank(),
axis.text.x=element_blank()
) +
labs( y = "SD Diff",
title = "School Performance Variation by Content Reporting Category",
x = "",
caption = "NextGen HS Physics MCAS") +
facet_wrap(~`Report Cat`)
HSPhy_NextGen_SchoolSum%>%
select(`EM Perf Stat`, `MF Diff SD`, `EN Diff SD`, `WA Diff SD` )%>%
pivot_longer(c(2:4), names_to = "Report Cat", values_to = "SD Diff")%>%
ggplot( aes(x= `Report Cat`, y=`SD Diff`, fill= `Report Cat`)) +
geom_boxplot() +
scale_fill_viridis(discrete = TRUE, alpha=0.6) +
geom_jitter(color="black", size=0.4, alpha=0.9) +
#theme_ipsum() +
theme(
plot.title = element_text(size=11),
axis.title.x=element_blank(),
axis.text.x=element_blank()
) +
labs( y = "SD Diff",
title = "School Content Reporting Cat. Variation by Achievement Level",
x = "",
caption = "NextGen HS Physics MCAS") +
facet_wrap(~`EM Perf Stat`)
HSPhy_NextGen_SchoolSum%>%
select(`EM Perf Stat`, `MF%`, `EN%`, `WA%` )%>%
pivot_longer(c(2:4), names_to = "Report Cat", values_to = "% Points")%>%
ggplot( aes(x= `Report Cat`, y=`% Points`, fill= `Report Cat`)) +
geom_boxplot() +
scale_fill_viridis(discrete = TRUE, alpha=0.6) +
geom_jitter(color="black", size=0.4, alpha=0.9) +
#theme_ipsum() +
theme(
plot.title = element_text(size=11),
axis.title.x=element_blank(),
axis.text.x=element_blank()
) +
labs( y = "Report Cat%",
title = "School Content Reporting Cat. Performance by Achievement Level",
x = "",
caption = "NextGen HS Physics MCAS") +
facet_wrap(~`EM Perf Stat`)
```
# Hypothesis Testing
```{r}
HSPhy_NextGen_SchoolSum<-HSPhy_NextGen_SchoolSum%>%
ungroup()
HSPhy_NextGen_SchoolSum<-HSPhy_NextGen_SchoolSum%>%
mutate(`EorM%` = `E%` + `M%`)
```
## ANOVA: SD Diff by Achievement Level (MD, ERM, WA, EN, MF)
```{r}
ANOVA_MD <- aov(`MD Diff SD` ~ `EM Perf Stat`, data=HSPhy_NextGen_SchoolSum)
summary(ANOVA_MD)
ANOVA_ERM <- aov(`ERM Diff SD` ~ `EM Perf Stat`, data=HSPhy_NextGen_SchoolSum)
summary(ANOVA_ERM)
ANOVA_WA <- aov(`WA Diff SD` ~ `EM Perf Stat`, data=HSPhy_NextGen_SchoolSum)
summary(ANOVA_WA)
ANOVA_EN <- aov(`EN Diff SD` ~ `EM Perf Stat`, data=HSPhy_NextGen_SchoolSum)
summary(ANOVA_EN)
ANOVA_MF <- aov(`MF Diff SD` ~ `EM Perf Stat`, data=HSPhy_NextGen_SchoolSum)
summary(ANOVA_MF)
HSPhy_NextGen_SchoolSum%>%
select(`EM Perf Stat`, `MF Diff SD`, `EN Diff SD`, `WA Diff SD` )%>%
pivot_longer(c(2:4), names_to = "Report Cat", values_to = "SD Diff")%>%
group_by(`Report Cat`, `EM Perf Stat`)%>%
summarize(`SD SD Diff` = sd(`SD Diff`, na.rm = TRUE))
HSPhy_NextGen_SchoolSum%>%
select(`EM Perf Stat`, `ERM Diff SD`, `MD Diff SD` )%>%
pivot_longer(c(2:3), names_to = "Practice Cat", values_to = "SD Diff")%>%
group_by(`Practice Cat`, `EM Perf Stat`)%>%
summarize(`SD SD Diff` = sd(`SD Diff`, na.rm = TRUE))
```
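Where an omnibus ANOVA above is significant, a post-hoc comparison shows which achievement-level groups actually differ. A minimal sketch, assuming `ANOVA_MD` is fit as above (the `MD Diff SD` model is used only as an illustration; the same call applies to the other fits):

```{r}
# Tukey's HSD: pairwise differences in mean `MD Diff SD` between
# `EM Perf Stat` groups, with family-wise adjusted p-values.
TukeyHSD(ANOVA_MD)

# Quick visual check of the ANOVA assumptions:
# residuals vs. fitted (equal variances) and normal Q-Q plot.
plot(ANOVA_MD, which = 1:2)
```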
```{r}
ggplot(data = HSPhy_NextGen_SchoolSum, aes(x = `ERM Diff SD`, y = (`EorM%`))) +
geom_point() +
geom_smooth(method="lm", se=T)
ggplot(data = HSPhy_NextGen_SchoolSum, aes(x = `MD Diff SD`, y = (`EorM%`))) +
geom_point() +
geom_smooth(method="lm", se=T)
ggplot(data = HSPhy_NextGen_SchoolSum, aes(x = `MD Diff SD`, y = (`E%`))) +
geom_point() +
geom_smooth(method="lm", se=T)
ggplot(data = HSPhy_NextGen_SchoolSum, aes(x = `ERM Diff SD`, y = (`E%`))) +
geom_point() +
geom_smooth(method="lm", se=T)
ggplot(data = HSPhy_NextGen_SchoolSum, aes(x = (`WA Diff SD`), y = ((`E%`)))) +
geom_point() +
geom_smooth(method="lm", se=T)
```
## MD Diff Alone
```{r}
fit_md = lm(`EorM%` ~ (`MD Diff SD`), data = HSPhy_NextGen_SchoolSum)
summary(fit_md)
```
## MD and ERM Diff
In the joint model, `MD Diff SD` remains statistically significant while `ERM Diff SD` does not, suggesting that once MD variability is accounted for, ERM variability adds little explanatory power for `EorM%`.
```{r}
fit_md_erm = lm(`EorM%` ~ (`ERM Diff SD` + `MD Diff SD`), data = HSPhy_NextGen_SchoolSum)
summary(fit_md_erm)
```
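Whether `ERM Diff SD` adds anything beyond `MD Diff SD` can also be tested formally with a nested-model F-test. A sketch under the assumption that both models are refit on identical rows (necessary for `anova()` to compare them); `dat`, `m1`, and `m2` are illustrative names, not objects defined elsewhere in this document:

```{r}
# Drop incomplete rows so both models see exactly the same data.
dat <- HSPhy_NextGen_SchoolSum %>%
  filter(!is.na(`EorM%`), !is.na(`MD Diff SD`), !is.na(`ERM Diff SD`))
m1 <- lm(`EorM%` ~ `MD Diff SD`, data = dat)
m2 <- lm(`EorM%` ~ `MD Diff SD` + `ERM Diff SD`, data = dat)
anova(m1, m2)  # small p-value => ERM Diff SD improves the fit
```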
## Reporting Category DIFF alone and with interactions
### EN
```{r}
fit_en = lm(`EorM%` ~ (`EN Diff SD`), data = HSPhy_NextGen_SchoolSum)
summary(fit_en)
```
### MF
```{r}
fit_mf = lm(`EorM%` ~ (`MF Diff SD`), data = HSPhy_NextGen_SchoolSum)
summary(fit_mf)
```
### WA
```{r}
fit_wa = lm(`EorM%` ~ (`WA Diff SD`), data = HSPhy_NextGen_SchoolSum)
summary(fit_wa)
```
### MF/EN
```{r}
fit_mf_en = lm(`EorM%` ~ (`MF Diff SD`) + `EN Diff SD` + `MF Diff SD`*`EN Diff SD`, data = HSPhy_NextGen_SchoolSum)
summary(fit_mf_en)
```
### MF/WA
```{r}
fit_mf_wa = lm(`EorM%` ~ (`MF Diff SD`) + `WA Diff SD` + `MF Diff SD`*`WA Diff SD`, data = HSPhy_NextGen_SchoolSum)
summary(fit_mf_wa)
```
## Practice Cat Interacting with Reporting Cat
### MD/WA
```{r}
fit_md_wa = lm(`EorM%` ~ (`MD Diff SD`) + `WA Diff SD` + `MD Diff SD`*`WA Diff SD`, data = HSPhy_NextGen_SchoolSum)
summary(fit_md_wa)
```
### MD/MF
```{r}
fit_md_mf = lm(`EorM%` ~ (`MD Diff SD`) + `MF Diff SD` + `MD Diff SD`*`MF Diff SD`, data = HSPhy_NextGen_SchoolSum)
summary(fit_md_mf)
```
### MD/EN
```{r}
fit_md_en = lm(`EorM%` ~ (`MD Diff SD`) + `EN Diff SD` + `MD Diff SD`*`EN Diff SD`, data = HSPhy_NextGen_SchoolSum)
summary(fit_md_en)
```
## Practice Cat %
### MD/ERM
```{r}
fit_md_erm_percent = lm(`EorM%` ~ `MD%` + `ERM%` + `MD%`*`ERM%`, data = HSPhy_NextGen_SchoolSum)
summary(fit_md_erm_percent)
```
### MD/WA
```{r}
fit_md_wa_percent = lm(`EorM%` ~ `MD%` + `WA%` + `MD%`*`WA%`, data = HSPhy_NextGen_SchoolSum)
summary(fit_md_wa_percent)
```
### ERM/WA
```{r}
fit_erm_wa_percent = lm(`EorM%` ~ `ERM%` + `WA%` + `ERM%`*`WA%`, data = HSPhy_NextGen_SchoolSum)
summary(fit_erm_wa_percent)
```
### MD, ERM, MF
```{r}
fit_md_erm_mf_percent = lm(`EorM%` ~ `ERM%` + `MD%` + `MF%` + `MD%`*`ERM%`+ `MD%`*`MF%` + `ERM%`*`MF%`, data = HSPhy_NextGen_SchoolSum)
summary(fit_md_erm_mf_percent)
```
### MD, ERM, WA
```{r}
fit_md_erm_wa_percent = lm(`EorM%` ~ `ERM%` + `MD%` + `WA%` + `MD%`*`ERM%`+ `MD%`*`WA%` + `ERM%`*`WA%`, data = HSPhy_NextGen_SchoolSum)
summary(fit_md_erm_wa_percent)
```
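Since `stargazer` is already loaded in the setup chunk, the competing practice-category % models can be summarized side by side rather than read from separate `summary()` blocks. A sketch, assuming the fits above; `type = "text"` keeps the table legible in a knitted HTML document:

```{r}
# Side-by-side comparison of the % interaction models for EorM%.
stargazer(fit_md_erm_percent, fit_md_wa_percent, fit_erm_wa_percent,
          type = "text",
          title = "Practice/Reporting Category % Models for EorM%",
          single.row = TRUE)
```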
# Scatter Plots
## MD + WA
```{r}
fit_md_wa$coefficients
summary(fit_md_wa)
# Fitted values from fit_md_wa vs. observed EorM%; predict() avoids
# hard-coding coefficients that go stale if the model is refit.
ggplot(data = HSPhy_NextGen_SchoolSum,
       aes(x = predict(fit_md_wa, newdata = HSPhy_NextGen_SchoolSum), y = `EorM%`)) +
  geom_point() +
  geom_smooth(method = "lm", se = TRUE)
ggplot(data = HSPhy_NextGen_SchoolSum, aes(x = `MD Diff SD` + `WA Diff SD`, y = `EorM%`)) +
  geom_point() +
  geom_smooth(method = "lm", se = TRUE)
```
```{r}
# Fitted values from fit_md_wa_percent vs. observed EorM%, again via predict()
# rather than hand-copied coefficients.
ggplot(data = HSPhy_NextGen_SchoolSum,
       aes(x = predict(fit_md_wa_percent, newdata = HSPhy_NextGen_SchoolSum), y = `EorM%`)) +
  geom_point() +
  geom_smooth(method = "lm", se = TRUE)
```
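Before leaning on any of these fits, the standard residual diagnostics are worth a look. A minimal sketch using base R's `plot` method on the `fit_md_wa_percent` model assumed above:

```{r}
# Residuals vs. fitted (linearity/homoscedasticity) and normal Q-Q plot.
par(mfrow = c(1, 2))
plot(fit_md_wa_percent, which = 1:2)
par(mfrow = c(1, 1))
```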
```{r}
ggplot(data = HSPhy_NextGen_SchoolSum, aes(x = `MD%`, y = log(`E%`))) +
geom_point() +
geom_smooth(method="lm", se=T)
ggplot(data = HSPhy_NextGen_SchoolSum, aes(x = `ERM%`, y = log(`E%`))) +
geom_point() +
geom_smooth(method="lm", se=T)
ggplot(data = HSPhy_NextGen_SchoolSum, aes(x = log(`MD Diff SD`), y = (`E%`))) +
geom_point() +
geom_smooth(method="lm", se=T)
```
```{r}
# Renamed so these log-% fits don't overwrite the earlier fit_wa/fit_md objects.
fit_wa_log = lm(`E%` ~ log(`WA%`), data = HSPhy_NextGen_SchoolSum)
summary(fit_wa_log)
fit_md_log = lm(`E%` ~ log(`MD%`), data = HSPhy_NextGen_SchoolSum)
summary(fit_md_log)
ggplot(data = HSPhy_NextGen_SchoolSum, aes(x = log(`WA%`), y = log(`E%`))) +
geom_point() +
geom_smooth(method="lm", se=T)
```
# References